Glaucoma is one of the most severe eye diseases, characterized by rapid progression that leads to irreversible blindness. Diagnosis often occurs only after a person's vision has already degraded significantly, owing to the absence of noticeable symptoms in the early stages of the disease. Regular glaucoma screening of the population should improve early detection; however, the desirable frequency of examinations is often infeasible because manual diagnosis places an excessive load on the limited pool of specialists. Given that the basic procedure for glaucoma detection is the analysis of fundus images for the optic cup-to-disc ratio, machine learning algorithms can offer sophisticated methods for image processing and classification. In our work, we propose an advanced image preprocessing technique combined with a multi-view network of deep classification models to classify glaucoma. Our Glaucoma Automated Retinal Detection Network (GARDNet) has been successfully tested on the Rotterdam EyePACS AIROGS dataset with an AUC of 0.92, then fine-tuned and tested on the RIM-ONE DL dataset, achieving an AUC of 0.9308 and outperforming the state-of-the-art result of 0.9272. Our code will be available on GitHub after acceptance.
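As a minimal sketch of the multi-view idea described above (not the authors' GARDNet code), a shared CNN backbone can score several views of the same fundus image and average the per-view logits; the class count, backbone choice, and all names below are illustrative assumptions.

```python
# Hypothetical sketch of a multi-view classifier; not the authors' GARDNet code.
import torch
import torch.nn as nn
from torchvision import models

class MultiViewNet(nn.Module):
    """Scores several views of one fundus image with a shared backbone."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.backbone = models.resnet18(weights=None)  # weights shared across views
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, n_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, n_views, 3, H, W)
        b, v, c, h, w = views.shape
        logits = self.backbone(views.reshape(b * v, c, h, w))
        return logits.reshape(b, v, -1).mean(dim=1)  # average per-view logits

model = MultiViewNet()
dummy = torch.randn(4, 3, 3, 224, 224)  # 4 images, 3 views each
print(model(dummy).shape)  # torch.Size([4, 2])
```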
The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems, ranging from traffic surveillance to vehicle identification and many more. In recent times, deep learning models have dominated the field of vehicle detection; yet Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is real-time application, where `You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we conducted a performance analysis of different variants of the YOLO-based architectures: YOLOv3, YOLOv5s, and YOLOv5x. The models were trained on a dataset containing 7390 images belonging to 21 types of vehicles, comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOv5x variant to be the best-suited model, outperforming the YOLOv3 and YOLOv5s models by 7 and 4 percent in mAP, respectively, and by 12 and 8.5 percent in accuracy.
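For illustration, such a side-by-side comparison might be run by loading pretrained YOLOv5 variants through the public Ultralytics torch.hub entry point; the sketch below uses the generic COCO checkpoints and a sample image, not the fine-tuned Bangladeshi-vehicle models from the paper.

```python
# Hedged sketch: comparing off-the-shelf YOLOv5 variants on one image.
# Generic COCO weights, not the paper's fine-tuned vehicle checkpoints.
import torch

for variant in ("yolov5s", "yolov5x"):
    model = torch.hub.load("ultralytics/yolov5", variant, pretrained=True)
    results = model("https://ultralytics.com/images/zidane.jpg")
    print(variant, results.xyxy[0].shape[0], "detections")
```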
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once, which was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
In this work, we focus on improving the captions generated by image-captioning systems. We propose a novel re-ranking approach that leverages visual-semantic measures to identify the candidate caption that maximally captures the visual information in the image. Our re-ranker uses the Belief Revision framework (Blok et al., 2003) to calibrate the original likelihood of the top candidate captions by explicitly exploiting the semantic relatedness between the depicted caption and the visual context. Our experiments demonstrate the utility of our approach, where we observe that our re-ranker can enhance the performance of a typical image-captioning system without any additional training or fine-tuning.
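One published form of the Blok et al. (2003) belief-revision (SimProb) rule revises a hypothesis probability P(h) given evidence e as P(h|e) = P(h)^λ with λ = ((1 − sim(h, e)) / (1 + sim(h, e)))^(1 − P(e)). The sketch below applies that form to re-rank caption likelihoods against a visual-context score; the similarity values and probabilities are placeholders, not quantities from the paper.

```python
# Hedged sketch of belief-revision re-ranking (SimProb form, Blok et al., 2003).
# Similarities and probabilities below are placeholder inputs, not paper values.

def revised_probability(p_caption: float, p_context: float, sim: float) -> float:
    """P(h|e) = P(h)**lam, lam = ((1 - sim) / (1 + sim)) ** (1 - P(e))."""
    lam = ((1.0 - sim) / (1.0 + sim)) ** (1.0 - p_context)
    return p_caption ** lam

candidates = [
    ("a dog running on grass", 0.30, 0.9),  # (caption, model likelihood, visual sim)
    ("a cat sitting on grass", 0.35, 0.2),
]
p_context = 0.8  # assumed probability of the visual context
reranked = sorted(
    candidates,
    key=lambda c: revised_probability(c[1], p_context, c[2]),
    reverse=True,
)
print(reranked[0][0])  # the visually grounded caption overtakes the higher raw likelihood
```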
Owing to recent advances in computer vision, traffic video data has become a key factor in managing traffic congestion. This work presents a unique technique that applies a color-coding scheme to traffic data before training a deep convolutional neural network. First, the video data are converted into an image dataset. Next, vehicle detection is performed using the You Only Look Once algorithm. A color-coding scheme is then applied to convert the image dataset into a binary image dataset, and these binary images are fed into the deep convolutional neural network. Using the UCSD dataset, we obtained a classification accuracy of 98.2%.
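A hedged sketch of the intermediate step described above: paint detected vehicle boxes into a binary image that a CNN can consume. The box coordinates stand in for YOLO detections; the frame size and values are assumptions for illustration.

```python
# Hedged sketch: detected vehicle boxes -> binary image for the CNN.
# The box coordinates are placeholders standing in for YOLO detections.
import numpy as np

def boxes_to_binary_mask(boxes_xyxy, h, w):
    """Paint each detected vehicle box as a white region on a black image."""
    mask = np.zeros((h, w), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes_xyxy:
        mask[max(int(y1), 0):int(y2), max(int(x1), 0):int(x2)] = 255
    return mask

detections = [(40, 60, 120, 140), (180, 90, 260, 170)]  # placeholder YOLO boxes
binary = boxes_to_binary_mask(detections, h=240, w=320)
print(binary.mean())  # fraction of the frame covered by vehicles (scaled by 255)
```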
Walking is one of the most common modes of human terrestrial locomotion and is essential for most daily activities. When a person walks, their movement follows a pattern known as gait. Gait analysis is used in sports and healthcare. Gait can be analyzed in different ways, for example from video captured by surveillance cameras or from depth-image cameras in a laboratory environment. It can also be recognized through wearable sensors such as accelerometers, force sensors, gyroscopes, flexible goniometers, magnetoresistive sensors, electromagnetic tracking systems, and electromyography (EMG). Analysis with these sensors, however, requires laboratory conditions, or the user must wear the sensors. To detect abnormalities in a person's gait, such sensors would need to be incorporated separately, and once an abnormal gait is detected, it can reveal information about the person's health. Distinguishing a regular gait from an abnormal one using smart wearable technology can therefore provide insight into a subject's health condition. In this paper, we propose a way to analyze abnormal gait through smartphone sensors. Since most people nowadays carry smart devices such as smartphones and smartwatches, we can use the sensors of these smart wearables to track their gait.
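As a toy illustration of smartphone-based gait analysis, the sketch below detects steps in a vertical accelerometer trace by peak finding and flags unusually irregular stride intervals. The sampling rate, thresholds, and the simple variability criterion are assumptions, not the paper's method.

```python
# Toy sketch: step detection from a smartphone accelerometer trace.
# Sampling rate, thresholds, and the irregularity criterion are assumptions.
import numpy as np
from scipy.signal import find_peaks

fs = 50  # assumed sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)
accel = 9.81 + np.sin(2 * np.pi * 1.8 * t) + 0.2 * np.random.randn(t.size)

# Each peak in vertical acceleration is treated as one step.
peaks, _ = find_peaks(accel, height=9.81, distance=int(0.4 * fs))
stride_intervals = np.diff(peaks) / fs

# A crude irregularity flag: high variability of stride time.
cv = stride_intervals.std() / stride_intervals.mean()
print(f"{peaks.size} steps, stride CV = {cv:.2f}",
      "-> irregular" if cv > 0.2 else "-> regular")
```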
Geosteering of wells requires fast interpretation of geophysical logs, which is a non-unique inverse problem. This work presents a proof-of-concept approach to multi-modal probabilistic inversion of logs using a single evaluation of an artificial deep neural network (DNN). A mixture density DNN (MDN) is trained using the "multiple-trajectory-prediction" (MTP) loss function, which avoids the mode collapse typical of traditional MDNs and allows multi-modal prediction ahead of data. The proposed approach is verified on the real-time stratigraphic inversion of gamma-ray logs. The multi-modal predictor outputs several likely inverse solutions/predictions, providing more accurate and realistic solutions than a deterministic regression using a DNN. For these likely stratigraphic curves, the model simultaneously predicts their probabilities, which are implicitly learned from the training geological data. The stratigraphy predictions and their probabilities, obtained in milliseconds from the MDN, can enable better real-time decisions under geological uncertainties.
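The MTP loss can be sketched as follows: the network outputs K candidate curves plus a probability for each; only the candidate closest to the ground truth receives the regression loss, while a cross-entropy term trains the probabilities to identify that candidate. This is a generic PyTorch rendering of the winner-takes-all idea, not the authors' implementation.

```python
# Hedged sketch of an MTP-style loss for a K-mode predictor (PyTorch).
# Generic rendering of the winner-takes-all idea, not the authors' code.
import torch
import torch.nn.functional as F

def mtp_loss(pred_modes, mode_logits, target):
    """pred_modes: (B, K, D) candidate curves; mode_logits: (B, K); target: (B, D)."""
    dists = ((pred_modes - target.unsqueeze(1)) ** 2).mean(dim=-1)  # (B, K)
    best = dists.argmin(dim=1)                                      # closest mode per sample
    reg = dists.gather(1, best.unsqueeze(1)).mean()                 # regress only the winner
    cls = F.cross_entropy(mode_logits, best)                        # learn mode probabilities
    return reg + cls

preds = torch.randn(8, 4, 32, requires_grad=True)  # 8 samples, 4 modes, 32-point curves
logits = torch.randn(8, 4, requires_grad=True)
target = torch.randn(8, 32)
mtp_loss(preds, logits, target).backward()
```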
The performance of Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data, and when these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from performance deficiency. This paper presents a method that aims to learn a confident model in the presence of noisy labels, in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluate the proposed method on a curated dataset, where the noise type and level of the various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate that our model is more confident in predicting GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.
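The entropy regularizer mentioned above can be sketched as a penalty added to the usual cross-entropy so the classifier avoids overconfident fits to noisy labels; the weight `beta` and the sign convention below are illustrative assumptions, not the paper's exact objective.

```python
# Hedged sketch: cross-entropy plus an entropy-based regularizer (PyTorch).
# The weight `beta` and sign convention are illustrative, not the paper's loss.
import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, noisy_labels, beta=0.1):
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    ce = F.cross_entropy(logits, noisy_labels)
    # Rewarding higher entropy discourages overconfident fits to corrupted labels.
    return ce - beta * entropy

logits = torch.randn(16, 10, requires_grad=True)
labels = torch.randint(0, 10, (16,))
entropy_regularized_loss(logits, labels).backward()
```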
Recent advances in upper limb prostheses have led to significant improvements in the number of movements provided by the robotic limb. However, the method for controlling multiple degrees of freedom via user-generated signals remains challenging. To address this issue, various machine learning controllers have been developed to better predict movement intent. As these controllers become more intelligent and take on more autonomy in the system, the traditional approach of representing the human-machine interface as a human controlling a tool becomes limiting. One possible approach to improve the understanding of these interfaces is to model them as collaborative, multi-agent systems through the lens of joint action. The field of joint action has been commonly applied to two human partners who are trying to work jointly together to achieve a task, such as singing or moving a table together, by effecting coordinated change in their shared environment. In this work, we compare different prosthesis controllers (proportional electromyography with sequential switching, pattern recognition, and adaptive switching) in terms of how they present the hallmarks of joint action. The results of the comparison lead to a new perspective for understanding how existing myoelectric systems relate to each other, along with recommendations for how to improve these systems by increasing the collaborative communication between each partner.
Current neural network models of dialogue generation (chatbots) show great promise for generating answers for chatty agents, but they are short-sighted: they predict utterances one at a time while disregarding their impact on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led traditional NLP dialogue models to rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: informativity, coherence, and simplicity of response (related to the forward-looking function). We assess our model on its diversity, length, and complexity relative to human judgments. In dialogue simulation, evaluations demonstrate that the proposed model generates more interactive responses and encourages more sustained, successful conversations. This work marks a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
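A minimal sketch of the policy-gradient step described above: a response token is sampled from the dialogue policy, scored with a scalar conversational reward, and the log-probability of the sample is scaled by that reward (REINFORCE). The tiny linear policy and the reward value are placeholders, not the paper's trained components.

```python
# Hedged sketch of a REINFORCE update for a dialogue policy (PyTorch).
# The tiny policy and the scalar reward below are placeholders.
import torch

vocab, hidden = 100, 32
policy = torch.nn.Linear(hidden, vocab)          # stand-in for a seq2seq decoder step
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

state = torch.randn(1, hidden)                   # encoded dialogue history (placeholder)
dist = torch.distributions.Categorical(logits=policy(state))
action = dist.sample()                           # sampled response token

reward = 0.7  # placeholder score combining informativity/coherence/simplicity
loss = -(dist.log_prob(action) * reward).mean()  # REINFORCE: scale log-prob by reward
optimizer.zero_grad()
loss.backward()
optimizer.step()
```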